AITopics | support feature

Hybrid Mamba for Few-Shot Segmentation

Neural Information Processing SystemsMar-21-2026, 10:55:05 GMT

Many few-shot segmentation (FSS) methods use cross attention to fuse support foreground (FG) into query features, regardless of the quadratic complexity. A recent advance Mamba can also well capture intra-sequence dependencies, yet the complexity is only linear. Hence, we aim to devise a cross (attention-like) Mamba to capture inter-sequence dependencies for FSS. A simple idea is to scan on support features to selectively compress them into the hidden state, which is then used as the initial hidden state to sequentially scan query features. Nevertheless, it suffers from (1) support forgetting issue: query features will also gradually be compressed when scanning on them, so the support features in hidden state keep reducing, and many query pixels cannot fuse sufficient support features; (2) intra-class gap issue: query FG is essentially more similar to itself rather than to support FG, i.e., query may prefer not to fuse support features but their own ones from the hidden state, yet the success of FSS relies on the effective use of support information. To tackle them, we design a hybrid Mamba network (HMNet), including (1) a support recapped Mamba to periodically recap the support features when scanning query, so the hidden state can always contain rich support information; (2) a query intercepted Mamba to forbid the mutual interactions among query pixels, and encourage them to fuse more support features from the hidden state. Consequently, the support information is better utilized, leading to better performance. Extensive experiments have been conducted on two public benchmarks, showing the superiority of HMNet. The code is available at https://github.com/Sam1224/HMNet.

artificial intelligence, proceedings, support feature, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.84)

Add feedback

HybridMambaforFew-ShotSegmentation

Neural Information Processing SystemsFeb-16-2026, 09:04:59 GMT

Manyfew-shot segmentation (FSS) methods use cross attention to fuse support foreground (FG) into query features, regardless of the quadratic complexity.

artificial intelligence, machine learning, segmentation, (18 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Vision (0.68)

Add feedback

Few-ShotSegmentationviaCycle-Consistent Transformer

Neural Information Processing SystemsFeb-10-2026, 21:12:44 GMT

In this paper, we focus on utilizing pixel-wise relationships between support and query images to facilitate the few-shot segmentation task.

machine learning, natural language, segmentation, (19 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

VT-FSL: Bridging Vision and Text with LLMs for Few-Shot Learning

Li, Wenhao, Wang, Qiangchang, Meng, Xianjing, Wu, Zhibin, Yin, Yilong

arXiv.org Artificial IntelligenceOct-24-2025

Few-shot learning (FSL) aims to recognize novel concepts from only a few labeled support samples. Recent studies enhance support features by incorporating additional semantic information or designing complex semantic fusion modules. However, they still suffer from hallucinating semantics that contradict the visual evidence due to the lack of grounding in actual instances, resulting in noisy guidance and costly corrections. To address these issues, we propose a novel framework, bridging Vision and Text with LLMs for Few-Shot Learning (VT-FSL), which constructs precise cross-modal prompts conditioned on Large Language Models (LLMs) and support images, seamlessly integrating them through a geometry-aware alignment. It mainly consists of Cross-modal Iterative Prompting (CIP) and Cross-modal Geometric Alignment (CGA). Specifically, the CIP conditions an LLM on both class names and support images to generate precise class descriptions iteratively in a single structured reasoning pass. These descriptions not only enrich the semantic understanding of novel classes but also enable the zero-shot synthesis of semantically consistent images. The descriptions and synthetic images act respectively as complementary textual and visual prompts, providing high-level class semantics and low-level intra-class diversity to compensate for limited support data. Furthermore, the CGA jointly aligns the fused textual, support, and synthetic visual representations by minimizing the kernelized volume of the 3-dimensional parallelotope they span. It captures global and nonlinear relationships among all representations, enabling structured and consistent multimodal integration. The proposed VT-FSL method establishes new state-of-the-art performance across ten diverse benchmarks, including standard, cross-domain, and fine-grained few-shot learning scenarios. Code is available at https://github.com/peacelwh/VT-FSL.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2509.25033

Country: Asia > China (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

86fe62e3b315d2578721562d9fd1a433-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 08:29:53 GMT

mamba, segmentation, support feature, (13 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Singapore (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

b8b12f949378552c21f28deff8ba8eb6-Supplemental.pdf

Neural Information Processing SystemsAug-17-2025, 01:52:15 GMT

artificial intelligence, machine learning, segmentation, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

b8b12f949378552c21f28deff8ba8eb6-Paper.pdf

Neural Information Processing SystemsAug-17-2025, 01:52:11 GMT

machine learning, natural language, segmentation, (17 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.97)
(2 more...)

Add feedback

Hybrid Mamba for Few-Shot Segmentation

Neural Information Processing SystemsMay-27-2025, 07:36:40 GMT

Many few-shot segmentation (FSS) methods use cross attention to fuse support foreground (FG) into query features, regardless of the quadratic complexity. A recent advance Mamba can also well capture intra-sequence dependencies, yet the complexity is only linear. Hence, we aim to devise a cross (attention-like) Mamba to capture inter-sequence dependencies for FSS. A simple idea is to scan on support features to selectively compress them into the hidden state, which is then used as the initial hidden state to sequentially scan query features. Nevertheless, it suffers from (1) support forgetting issue: query features will also gradually be compressed when scanning on them, so the support features in hidden state keep reducing, and many query pixels cannot fuse sufficient support features; (2) intra-class gap issue: query FG is essentially more similar to itself rather than to support FG, i.e., query may prefer not to fuse support features but their own ones from the hidden state, yet the success of FSS relies on the effective use of support information. To tackle them, we design a hybrid Mamba network (HMNet), including (1) a support recapped Mamba to periodically recap the support features when scanning query, so the hidden state can always contain rich support information; (2) a query intercepted Mamba to forbid the mutual interactions among query pixels, and encourage them to fuse more support features from the hidden state.

few-shot segmentation, mamba, support feature, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.87)

Add feedback

Memory-guided Network with Uncertainty-based Feature Augmentation for Few-shot Semantic Segmentation

Chen, Xinyue, Shi, Miaojing

arXiv.org Artificial IntelligenceJun-9-2024

The performance of supervised semantic segmentation methods highly relies on the availability of large-scale training data. To alleviate this dependence, few-shot semantic segmentation (FSS) is introduced to leverage the model trained on base classes with sufficient data into the segmentation of novel classes with few data. FSS methods face the challenge of model generalization on novel classes due to the distribution shift between base and novel classes. To overcome this issue, we propose a class-shared memory (CSM) module consisting of a set of learnable memory vectors. These memory vectors learn elemental object patterns from base classes during training whilst re-encoding query features during both training and inference, thereby improving the distribution alignment between base and novel classes. Furthermore, to cope with the performance degradation resulting from the intra-class variance across images, we introduce an uncertainty-based feature augmentation (UFA) module to produce diverse query features during training for improving the model's robustness. We integrate CSM and UFA into representative FSS works, with experimental results on the widely-used PASCAL-5$^i$ and COCO-20$^i$ datasets demonstrating the superior performance of ours over state of the art.

feature statistics, query feature, segmentation, (14 more...)

arXiv.org Artificial Intelligence

2406.00545

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.59)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Pinaki Laskar on LinkedIn: #selfdrivingcars #autonomousdriving #autonomouscars

#artificialintelligenceMay-11-2022, 13:36:06 GMT

AI Researcher, Cognitive Technologist Inventor - AI Thinking, Think Chain Innovator - AIOT, XAI, Autonomous Cars, IIOT Founder Fisheyebox Spatial Computing Savant, Transformative Leader, Industry X.0 Practitioner What do you think of the update to the SAE's levels of autonomous driving? Do you find these levels helpful when it comes to knowing what an AV can do? What's the difference between driver support features and automated driving? Society of Automotive Engineers (SAE) recognise that levels 0-2 are better defined as'driver support features.' Level 3 and above encompass what they would now refer to as'automated driving features.' a six degrees of automated driving: from zero automation to full automation.

automation, autonomouscar, selfdrivingcar, (6 more...)

#artificialintelligence

Genre: Personal > Interview (0.60)

Industry: